Reads were aligned to the hg38 assembly using STAR (2.5.2b). The following alignment QC report was produced:

SRP033351_QC_RnaSeqReport.html

HTSeq (0.6.1) function htseq-count was used to count reads. Counts for all samples were concatenated into the following text file:

SRP033351_htseq_gene.txt

DESeq2 (1.28.1) was used for differential gene expression analysis, based on the HTSeq counts matrix and the phenotype file provided. Normalized counts from DESeq2 are saved in the following text file:

SRP033351_counts_normalized_by_DESeq2.txt

Normalized counts are obtained by dividing the raw counts by the per-sample size factors from the DESeq2 function estimateSizeFactors(). Each sample's size factor is the median, across genes, of the ratio of that sample's count to the gene's geometric mean across samples; this function does not correct for read length. The normalization method is described in detail here: https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-10-r106
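The median-of-ratios idea behind estimateSizeFactors() can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code used to produce this report: the function names size_factors and normalize and the toy count matrix are hypothetical, and genes with a zero count in any sample are simply dropped before the size factors are estimated.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Genes with a zero in any sample have an undefined log geometric mean,
    # so they are excluded from size factor estimation.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 3],  # contains a zero, so it is ignored when estimating size factors
]
factors = size_factors(toy_counts)
normalized = normalize(toy_counts, factors)
```

In this toy case the two size factors differ by a factor of two, so the normalized counts of the first three genes agree between the samples. Gene length plays no role anywhere in the calculation.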

Differential gene expression analysis was done for all comparisons provided in the comparisons file. The following design was used:

design = ~ Donor + Status

If desired, the design can be modified to include more independent variables. In addition to the partial results displayed in this report, the full set of DESeq2 results for each comparison was saved in separate text files, with names of the form:

SRP033351_CASE_vs_CONTROL_DESeq2_results.txt

where CASE and CONTROL are pairs of conditions specified in the comparisons file.

Convert KEGG and REACTOME pathway files in .gmt format provided by MSigDB into a pathway list
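As a rough illustration of this conversion step, a .gmt file is tab-delimited with one gene set per line: the set name, a description field, then the member genes. The sketch below is plain Python and is not the code used in this report (in R, fgsea offers gmtPathways() for the same purpose; whether it was used here is not stated). The function name parse_gmt and the example pathway names, URLs, and gene symbols are hypothetical.

```python
def parse_gmt(lines):
    """Turn GMT-formatted lines into a dict: pathway name -> list of genes."""
    pathways = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # A valid record needs a name, a description, and at least one gene.
            continue
        name, _description, *genes = fields
        # Drop empty fields left by trailing tabs.
        pathways[name] = [g for g in genes if g]
    return pathways


# Hypothetical records; a real file would be read with open(path).
example_lines = [
    "HYPOTHETICAL_PATHWAY_A\thttp://example.org/a\tGENE1\tGENE2\tGENE3\n",
    "HYPOTHETICAL_PATHWAY_B\thttp://example.org/b\tGENE2\tGENE4\n",
]
pathways = parse_gmt(example_lines)
```

The resulting mapping from pathway name to gene list is the kind of structure a gene-set enrichment routine expects as its pathway argument.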

## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in design formula are characters, converting to factors

alb vs. untreated

Samples in this comparison

DE analysis

## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in design formula are characters, converting to factors

Comparison Summary

Status     Count
untreated  4
alb        4

Top 50 genes by p-value

Description of DESeq2 output

Volcano plots

Volcano plot (genes with a q-value < 0.05 are shown in red)

MA plot

Distribution of adjusted p-values

Dendrogram based on sample distance of regularized log transformed data

Heatmaps for top 30 significant genes

Genes were ranked by adjusted p-values.
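For readers unfamiliar with adjusted p-values: by default DESeq2 reports Benjamini-Hochberg adjusted p-values, computed after an independent filtering step. The sketch below shows only the Benjamini-Hochberg adjustment and the subsequent ranking, in plain Python; the filtering step is omitted, and the function name bh_adjust and the toy p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.20]
padj = bh_adjust(pvals)
# Gene indices ranked by adjusted p-value, smallest first.
ranked = sorted(range(len(pvals)), key=lambda i: padj[i])
```

Note that two genes can share the same adjusted p-value even when their raw p-values differ, so ties in a ranking by adjusted p-value are expected.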

Principal component analysis (PCA) Plot based on regularized log transformed data

Compute PCs and variance explained by the first 10 PCs

Variance explained
PC   Proportion of Variance (%)  Cumulative Proportion of Variance (%)
PC1  50.56                       50.56
PC2  24.46                       75.01
PC3  14.57                       89.59
PC4  6.765                       96.35
PC5  2.517                       98.87
PC6  0.7879                      99.66
PC7  0.3441                      100
PC8  4.748e-29                   100
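The two columns of a variance table like this follow directly from the standard deviations of the principal components. The sketch below, in plain Python, assumes the standard deviations are already available (for example from a PCA routine); the function name variance_explained and the toy values are hypothetical and are not taken from this analysis.

```python
def variance_explained(sdev):
    """Per-PC and cumulative percentage of variance from PC standard deviations."""
    variances = [s * s for s in sdev]
    total = sum(variances)
    proportion = [100.0 * v / total for v in variances]
    cumulative = []
    running = 0.0
    for p in proportion:
        running += p
        cumulative.append(running)
    return proportion, cumulative


# Hypothetical standard deviations for three components.
toy_sdev = [3.0, 2.0, 1.0]
proportion, cumulative = variance_explained(toy_sdev)
```

With eight samples, centering the data leaves at most seven components with non-zero variance, which is why the eighth component in the table is numerically zero.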

PCA plots are generated using the first two principal components, colored by known factors (e.g. Status, Tissue, or Donor)

Dispersion plot

Plot of the maximum Cook’s distance per gene over the rank of the Wald statistics for the condition

Boxplots for top 20 differentially expressed genes

Genes were ranked by p-value. Counts have been normalized by sequencing depth, with a pseudocount of 0.5 added to allow for log scale plotting, using the DESeq2 function plotCounts().
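The role of the pseudocount can be seen in a small plain-Python sketch: a normalized count of zero has no logarithm, so a constant is added before the value is placed on a log scale. This is an illustration only, not the plotCounts() implementation; the function name log2_plot_value is hypothetical, and base 2 is an assumption (the base only changes the axis labels).

```python
import math


def log2_plot_value(raw_count, size_factor, pseudocount=0.5):
    """Depth-normalized count plus pseudocount, on a log2 scale."""
    return math.log2(raw_count / size_factor + pseudocount)


# Hypothetical raw counts with a size factor of 1 for simplicity.
values = [log2_plot_value(c, 1.0) for c in (0, 15.5, 1023.5)]
```

A gene with zero reads in a sample therefore still appears on the plot, at the position corresponding to 0.5 rather than being dropped.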

Favorite genes

Boxplots and DE results for user-defined favorite genes, if any were provided

Gene-set enrichment analysis

View top main pathways in fgsea results.

## 38 main pathways are significant.

Generate a barplot of pathways with absolute NES >= 2, or of the top 10 pathways if no pathway passes the threshold

View leading edges in top pathways. Select the top five pathways with positive NES and the top five with negative NES
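The selection rules described for the barplot and the leading-edge view can be sketched as follows in plain Python. This is not the report's code: the function names pathways_for_barplot and top_by_direction and the toy results are hypothetical, and ranking the fallback "top 10" by adjusted p-value is an assumption, since the report does not say how the top pathways are chosen.

```python
def pathways_for_barplot(results, nes_cutoff=2.0, fallback_n=10):
    """Pathways with |NES| >= cutoff, or the top fallback_n if none pass."""
    passing = [r for r in results if abs(r["NES"]) >= nes_cutoff]
    if passing:
        return sorted(passing, key=lambda r: r["NES"], reverse=True)
    # Assumption: the fallback ranks pathways by adjusted p-value.
    return sorted(results, key=lambda r: r["padj"])[:fallback_n]


def top_by_direction(results, n=5):
    """Top n pathways with positive NES and top n with negative NES."""
    positive = sorted((r for r in results if r["NES"] > 0),
                      key=lambda r: r["NES"], reverse=True)[:n]
    negative = sorted((r for r in results if r["NES"] < 0),
                      key=lambda r: r["NES"])[:n]
    return positive, negative


# Hypothetical enrichment results; real ones would come from the fgsea output.
toy = [
    {"pathway": "HYPOTHETICAL_A", "NES": 2.4, "padj": 0.001},
    {"pathway": "HYPOTHETICAL_B", "NES": -2.1, "padj": 0.004},
    {"pathway": "HYPOTHETICAL_C", "NES": 1.3, "padj": 0.03},
    {"pathway": "HYPOTHETICAL_D", "NES": -0.9, "padj": 0.40},
]
selected = pathways_for_barplot(toy)
positive, negative = top_by_direction(toy)
```

Splitting by the sign of NES keeps pathways enriched among up-regulated genes separate from those enriched among down-regulated genes, so neither direction crowds out the other.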

dex vs. untreated

Samples in this comparison

DE analysis

## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in design formula are characters, converting to factors

Comparison Summary

Status     Count
untreated  4
dex        4

Top 50 genes by p-value

Description of DESeq2 output

Volcano plots

Volcano plot (genes with a q-value < 0.05 are shown in red)

MA plot

Distribution of adjusted p-values

Dendrogram based on sample distance of regularized log transformed data

Heatmaps for top 30 significant genes

Genes were ranked by adjusted p-values.

Principal component analysis (PCA) Plot based on regularized log transformed data

Compute PCs and variance explained by the first 10 PCs

Variance explained
PC   Proportion of Variance (%)  Cumulative Proportion of Variance (%)
PC1  45.79                       45.79
PC2  26.59                       72.38
PC3  14.57                       86.95
PC4  9.28                        96.23
PC5  2.044                       98.28
PC6  1.109                       99.39
PC7  0.6137                      100
PC8  2.87e-29                    100

PCA plots are generated using the first two principal components, colored by known factors (e.g. Status, Tissue, or Donor)

Dispersion plot

Plot of the maximum Cook’s distance per gene over the rank of the Wald statistics for the condition

Boxplots for top 20 differentially expressed genes

Genes were ranked by p-value. Counts have been normalized by sequencing depth, with a pseudocount of 0.5 added to allow for log scale plotting, using the DESeq2 function plotCounts().

Favorite genes

Boxplots and DE results for user-defined favorite genes, if any were provided

Gene-set enrichment analysis

View top main pathways in fgsea results.

## 23 main pathways are significant.

Generate a barplot of pathways with absolute NES >= 2, or of the top 10 pathways if no pathway passes the threshold

View leading edges in top pathways. Select the top five pathways with positive NES and the top five with negative NES

alb_dex vs. untreated

Samples in this comparison

DE analysis

## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in design formula are characters, converting to factors

Comparison Summary

Status     Count
untreated  4
alb_dex    4

Top 50 genes by p-value

Description of DESeq2 output

Volcano plots

Volcano plot (genes with a q-value < 0.05 are shown in red)

MA plot

Distribution of adjusted p-values

Dendrogram based on sample distance of regularized log transformed data

Heatmaps for top 30 significant genes

Genes were ranked by adjusted p-values.

Principal component analysis (PCA) Plot based on regularized log transformed data

Compute PCs and variance explained by the first 10 PCs

Variance explained
PC   Proportion of Variance (%)  Cumulative Proportion of Variance (%)
PC1  46.28                       46.28
PC2  24.3                        70.58
PC3  16.38                       86.96
PC4  9.872                       96.84
PC5  1.424                       98.26
PC6  0.9661                      99.22
PC7  0.7752                      100
PC8  5.248e-29                   100

PCA plots are generated using the first two principal components, colored by known factors (e.g. Status, Tissue, or Donor)

Dispersion plot

Plot of the maximum Cook’s distance per gene over the rank of the Wald statistics for the condition

Boxplots for top 20 differentially expressed genes

Genes were ranked by p-value. Counts have been normalized by sequencing depth, with a pseudocount of 0.5 added to allow for log scale plotting, using the DESeq2 function plotCounts().

Favorite genes

Boxplots and DE results for user-defined favorite genes, if any were provided

Gene-set enrichment analysis

View top main pathways in fgsea results.

## 52 main pathways are significant.

Generate a barplot of pathways with absolute NES >= 2, or of the top 10 pathways if no pathway passes the threshold

View leading edges in top pathways. Select the top five pathways with positive NES and the top five with negative NES

Housekeeping genes

Counts have been normalized by estimated size factors using DESeq2. The count matrix was obtained using the function DESeq2::counts().

The table shows p-values of housekeeping genes for each comparison. Generally, the expression of housekeeping genes does not change significantly between conditions.

Favorite gene expression in all conditions

R version 4.0.2 (2020-06-22)

Platform: x86_64-pc-linux-gnu (64-bit)

locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C

attached base packages: parallel, stats4, stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: pander(v.0.6.3), fgsea(v.1.14.0), biomaRt(v.2.44.1), tidyr(v.1.1.2), DT(v.0.16), DESeq2(v.1.28.1), SummarizedExperiment(v.1.18.2), DelayedArray(v.0.14.1), matrixStats(v.0.57.0), Biobase(v.2.48.0), GenomicRanges(v.1.40.0), GenomeInfoDb(v.1.24.2), IRanges(v.2.22.2), S4Vectors(v.0.26.1), BiocGenerics(v.0.34.0), viridis(v.0.5.1), viridisLite(v.0.3.0), ggplot2(v.3.3.2), RColorBrewer(v.1.1-2), gplots(v.3.1.0) and rmarkdown(v.2.5)

loaded via a namespace (and not attached): bitops(v.1.0-6), bit64(v.4.0.5), progress(v.1.2.2), httr(v.1.4.2), tools(v.4.0.2), R6(v.2.4.1), KernSmooth(v.2.23-17), DBI(v.1.1.0), colorspace(v.1.4-1), withr(v.2.3.0), tidyselect(v.1.1.0), gridExtra(v.2.3), prettyunits(v.1.1.1), bit(v.4.0.4), curl(v.4.3), compiler(v.4.0.2), labeling(v.0.3), caTools(v.1.18.0), scales(v.1.1.1), genefilter(v.1.70.0), askpass(v.1.1), rappdirs(v.0.3.1), stringr(v.1.4.0), digest(v.0.6.25), XVector(v.0.28.0), pkgconfig(v.2.0.3), htmltools(v.0.5.0), dbplyr(v.1.4.4), htmlwidgets(v.1.5.2), rlang(v.0.4.7), RSQLite(v.2.2.1), farver(v.2.0.3), generics(v.0.0.2), jsonlite(v.1.7.1), crosstalk(v.1.1.0.1), BiocParallel(v.1.22.0), gtools(v.3.8.2), dplyr(v.1.0.2), RCurl(v.1.98-1.2), magrittr(v.1.5), GenomeInfoDbData(v.1.2.3), Matrix(v.1.2-18), Rcpp(v.1.0.5), munsell(v.0.5.0), lifecycle(v.0.2.0), stringi(v.1.5.3), yaml(v.2.2.1), zlibbioc(v.1.34.0), BiocFileCache(v.1.12.1), grid(v.4.0.2), blob(v.1.2.1), crayon(v.1.3.4), lattice(v.0.20-41), splines(v.4.0.2), annotate(v.1.66.0), hms(v.0.5.3), locfit(v.1.5-9.4), knitr(v.1.30), pillar(v.1.4.6), geneplotter(v.1.66.0), fastmatch(v.1.1-3), XML(v.3.99-0.5), glue(v.1.4.2), evaluate(v.0.14), data.table(v.1.13.0), vctrs(v.0.3.4), gtable(v.0.3.0), openssl(v.1.4.3), purrr(v.0.3.4), assertthat(v.0.2.1), xfun(v.0.19), xtable(v.1.8-4), survival(v.3.1-12), tibble(v.3.0.3), AnnotationDbi(v.1.50.3), memoise(v.1.1.0) and ellipsis(v.0.3.1)